Optimization of sub-band weights using simulated noisy speech in multi-band speech recognition
Abstract
Multi-band speech recognition has recently been proposed to improve robustness to environmental noise. One important issue is how to combine the decisions of the individual sub-band recognizers to arrive at a final decision. Under the hidden Markov model (HMM) framework, one common approach is to combine the sub-band likelihoods linearly in an optimal manner, so that the more reliable sub-bands are emphasized and the corrupted sub-bands are de-emphasized. In our experience, estimating the weights from clean speech is not effective, as the resulting weights are not optimal in noisy environments. In this paper, we derive the optimal weights from simulated noisy speech using discriminative training with minimum classification error (MCE) or maximum mutual information (MMI) as the cost function. The methods are evaluated on recognition of isolated TI digits. Compared with full-band recognition in noise at an SNR of 0 dB, multi-band recognition with MCE-derived weights reduces word errors by 45.9% on a tone noise and by an average of 17.9% on three real noises. MCE-derived and MMI-derived weights perform similarly, and both perform much better than weights derived by other means.
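The weighted linear combination and MCE-style weight training described in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation, which the abstract does not give. It assumes that combination happens in the log-likelihood domain, that weights are kept non-negative and sum to one, that the MCE loss is a sigmoid of the score gap to the single best competing class, and that the weights are updated by plain gradient descent. All function names, parameters, and the toy data are hypothetical.

```python
import math

def combine_scores(subband_loglik, weights):
    """Weighted linear combination of per-band log-likelihoods.

    subband_loglik[k][c] is the log-likelihood of class c from sub-band k.
    Returns one combined score per class.
    """
    num_classes = len(subband_loglik[0])
    return [sum(w * band[c] for w, band in zip(weights, subband_loglik))
            for c in range(num_classes)]

def mce_loss_and_grad(subband_loglik, weights, true_class, gamma=1.0):
    """Sigmoid-smoothed classification error and its gradient w.r.t. the weights.

    Assumption: only the best-scoring competing class enters the
    misclassification measure (a common simplification).
    """
    scores = combine_scores(subband_loglik, weights)
    rival = max((c for c in range(len(scores)) if c != true_class),
                key=lambda c: scores[c])
    d = scores[rival] - scores[true_class]  # d > 0 means a recognition error
    loss = 1.0 / (1.0 + math.exp(-gamma * d))
    slope = gamma * loss * (1.0 - loss)     # derivative of the sigmoid
    grad = [slope * (band[rival] - band[true_class]) for band in subband_loglik]
    return loss, grad

def train_weights(data, num_bands, lr=0.1, epochs=50):
    """Gradient descent on the smoothed error, starting from equal weights."""
    weights = [1.0 / num_bands] * num_bands
    for _ in range(epochs):
        for subband_loglik, true_class in data:
            _, grad = mce_loss_and_grad(subband_loglik, weights, true_class)
            # Assumed constraints: clip at zero, then renormalize to sum to one.
            weights = [max(w - lr * g, 0.0) for w, g in zip(weights, grad)]
            total = sum(weights) or 1.0
            weights = [w / total for w in weights]
    return weights

# Hypothetical toy data: two bands, two classes. Band 0 is reliable;
# band 1 is corrupted and favors the wrong class in both examples.
toy_data = [
    ([[-1.0, -3.0], [-2.5, -2.0]], 0),
    ([[-3.0, -1.0], [-2.0, -2.5]], 1),
]

if __name__ == "__main__":
    print(train_weights(toy_data, num_bands=2))
```

On this toy data the training shifts weight toward the reliable band, which is the emphasis/de-emphasis behavior the abstract describes. In the paper's setting the training examples would be simulated noisy speech rather than hand-written scores.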
Related resources
Multi-band speech recognition in noisy environments
This paper presents a new approach for multi-band based automatic speech recognition (ASR). Recent work by Bourlard and Hermansky suggests that multi-band ASR gives more accurate recognition, especially in noisy acoustic environments, by combining the likelihoods of different frequency bands. Here we evaluate this likelihood recombination (LC) approach to multi-band ASR, and propose an alternati...
Development of an asynchronous multi-band system for continuous speech recognition
Multi-band automatic speech recognition (MBASR) has recently been proposed to combat environmental noise. In this paper, we describe the two major efforts in the development of our asynchronous MBASR system for continuous speech recognition. First, we successfully introduce asynchrony among sub-bands under the HMM composition framework. An asynchrony limit of one state is found adequate; relaxin...
Asynchrony with trained transition probabilities improves performance in multi-band speech recognition
One of the central themes in multi-band automatic speech recognition (ASR) is to devise a strategy for recombining sub-band information. This in turn raises two questions: (1) at what phonetic unit should the recombination take place? (2) How asynchronously should the sub-bands be run? Theoretically asynchronous multi-band ASR should perform at least as well as synchronous multi-band ASR. Howev...
From Multi-Band Full Combination to Multi-Stream Full Combination Processing in Robust ASR
The multi-band processing paradigm for noise robust ASR was originally motivated by the observation that human recognition appears to be based on independent processing of separate frequency sub-bands, and also by “missing data” results which have shown that ASR can be made significantly more robust to band-limited noise if noisy sub-bands can be detected and then ignored. Of the different mult...
Discriminative weighting of multi-resolution sub-band cepstral features for speech recognition
This paper explores possible strategies for the recombination of independent multi-resolution sub-band based recognisers. The multi-resolution approach is based on the premise that additional cues for phonetic discrimination may exist in the spectral correlates of a particular sub-band, but not in another. Weights are derived via discriminative training using the ‘Minimum Classification Error’ ...
Publication date: 2000